Instruction fetch architectures and code layout optimizations
Similar resources
Branch Classification to Control Instruction Fetch in Simultaneous Multithreaded Architectures
In Simultaneous Multithreaded architectures many separate threads are running concurrently, sharing processor resources, thereby realizing a high utilization rate of the available hardware. However, this also implies that threads are competing for resources and in many cases this competition can actually degrade overall performance. There are two major causes for this: first, instructions that,...
The Effect of Code Expanding Optimizations on Instruction Cache Design
This paper shows that code expanding optimizations have strong and non-intuitive implications on instruction cache design. Three types of code expanding optimizations are studied in this paper: instruction placement, function inline expansion, and superscalar optimizations. Overall, instruction placement reduces the miss ratio of small caches. Function inline expansion improves the performance ...
Fast & Accurate Instruction Fetch and Branch Prediction
Accurate branch prediction is critical to performance; mispredicted branches mean that tens of cycles may be wasted in superscalar architectures. Architectures combining very effective branch prediction mechanisms with modified branch target buffers (BTBs) have been proposed for wide-issue processors. These mechanisms require considerable processor resources. Concurrently, the larger ...
Single Instruction Fetch Does Not Inhibit Instruction-Level Parallelism
Superscalar machines fetch multiple scalar instructions per cycle from the instruction cache. However, machines that fetch no more than one instruction per cycle from the instruction cache, such as Dynamic Trace Scheduled VLIW (DTSVLIW) machines, have shown performance comparable to that of superscalar machines. In this paper, we present experiments that show that fetching a single instruction from th...
Journal
Journal title: Proceedings of the IEEE
Year: 2001
ISSN: 0018-9219
DOI: 10.1109/5.964440